Abstract

We analyze the achievable rate of the superposition of block Markov encoding (decode-and-forward)
and side information encoding (compress-and-forward) for the three-node Gaussian relay channel. It is
generally believed that the superposition can outperform decode-and-forward or compress-and-forward
due to its generality. We prove that within the class of Gaussian distributions, this is not the case:
the superposition scheme only achieves a rate that is equal to the maximum of the rates achieved by
decode-and-forward or compress-and-forward individually. We also present a superposition scheme that
combines broadcast with decode-and-forward, which, even though it does not achieve a higher rate than
decode-and-forward, provides insight into the main result mentioned above.

Index Terms

Relay channel, achievability, superposition encoding, Gaussian relay capacity.



I. Introduction 



The relay channel, introduced by van der Meulen [1], is a fundamental building block in
network information theory. It consists of a relay terminal assisting communication between a
source terminal and a destination terminal, facilitating a higher data rate than that of a
point-to-point channel. Cover and El Gamal [2] introduced two new coding strategies and a cut-set
upper bound for the relay channel. They derived the capacity of the degraded and reversely degraded
relay channels. Capacity results have been derived for special cases of the relay channel, such as
the semi-deterministic case [3], but the capacity of the general relay channel is still unknown.
The main achievability strategies known for the relay channel are Decode and Forward (DF)
and Compress and Forward (CF) [2]. The DF scheme is also known as the general block Markov
encoding scheme. The relay decodes the transmitted message and then, jointly with the source,
transmits the message to the destination terminal. The DF strategy is optimal and achieves the
cut-set bound when the source to relay link is strong. The CF scheme is known as the
side-information encoding scheme. The relay compresses its received signal without decoding and
transmits the compressed version to the destination terminal. The destination terminal treats the
compressed information as side information and decodes the original message. The CF scheme is
asymptotically optimal and achieves the cut-set bound when the relay to destination link is strong,
so that the received signal at the relay can be conveyed faithfully to the destination. A
combination of the two strategies that superimposes DF and CF was also proposed in [2, Theorem 7].
Hereafter we refer to this scheme as superposition forwarding (SF). The SF scheme achieves the
capacity for the special cases of degraded, reversely degraded and semi-deterministic relay
channels. Due to the generality of the result in [2, Theorem 7], it is expected to offer higher
achievable rates than DF or CF alone.

In this paper, we investigate the coding scheme for the general Gaussian relay channel. The
initial motivation for the work was to develop new coding strategies with higher achievable rates.
A new coding strategy was designed which superimposes Decode and Forward and Broadcast,
as presented in Section III. The scheme unfortunately yields a rate that is no better than that of
DF. This attempt, though not successful, prompted us to investigate the general superiority of SF,
especially for the Gaussian relay channel. It is found that for the Gaussian relay channel, within
the class of Gaussian distributions, SF can achieve at most the larger of the rates achievable by
DF or CF alone; there is no need to do superposition for Gaussian distributions (Section IV). We
also provide one numerical example that verifies the theoretical result in Section V. Section VI
concludes the paper.

Notation: For random variables $X$, $Y$, $Z$, we use $p(x, y, z)$ to denote the joint distribution,
when there is no confusion, as a shortcut for $p_{X,Y,Z}(x, y, z)$. When $X$ and $Z$ are conditionally
independent given $Y$ (i.e., $X$, $Y$, and $Z$ form a Markov chain), we write $X - Y - Z$.
II. Preliminaries

We present the mathematical models for the discrete-memoryless and Gaussian relay channels 
in this section, and also include the known results on achievable rates that will be used later. 

A. Discrete memoryless relay channel 

The general discrete memoryless relay channel (DMRC) is the same as defined in [2]. A brief
description is given here for completeness. The DMRC is denoted by
$(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y_2, y_3 | x_1, x_2),\ \mathcal{Y}_2 \times \mathcal{Y}_3)$,
where $\mathcal{X}_1, \mathcal{X}_2, \mathcal{Y}_2, \mathcal{Y}_3$ are finite sets and
$p(\cdot, \cdot | x_1, x_2)$ is a collection of probability distributions on
$\mathcal{Y}_2 \times \mathcal{Y}_3$, one for each $(x_1, x_2) \in \mathcal{X}_1 \times \mathcal{X}_2$;
$x_1$ and $x_2$ are the transmitted symbols at the source and the relay respectively; $y_2$ and $y_3$
are the received symbols at the relay and the destination terminal.

An $(M, n)$ code for the relay channel consists of a set of integers $\mathcal{M} = \{1, 2, \ldots, M\}$,
an encoding function $x_1 : \mathcal{M} \to \mathcal{X}_1^n$, a set of relay functions
$\{f_i\}_{i=1}^{n}$ such that

$$x_{2i} = f_i(Y_{21}, Y_{22}, \ldots, Y_{2(i-1)}), \quad 1 \le i \le n,$$

and a decoding function $g : \mathcal{Y}_3^n \to \mathcal{M}$. The joint probability mass function on
$\mathcal{M} \times \mathcal{X}_1^n \times \mathcal{X}_2^n \times \mathcal{Y}_2^n \times \mathcal{Y}_3^n$ is

$$p(w, x_1, x_2, y_2, y_3) = p(w) \prod_{i=1}^{n} p(x_{1i} | w)\, p(x_{2i} | y_{21}, y_{22}, \ldots, y_{2(i-1)})\, p(y_{2i}, y_{3i} | x_{1i}, x_{2i}). \tag{1}$$

Define $\lambda(w) = \Pr(g(Y_3) \ne w)$ as the probability of error of the decoding function of the relay
channel and let $\lambda_n$ be the maximal probability of error over all possible messages $w$. The rate
$R = (1/n) \log M$ of an $(M, n)$ code is said to be achievable by a relay channel if for any $\epsilon > 0$
and for sufficiently large $n$, there exists a code with $M \ge 2^{nR}$ such that $\lambda_n \le \epsilon$.

B. Gaussian relay channel 

Fig. 1 shows the Gaussian relay channel model that we will be using. The received symbols
at the relay and the destination terminal are given respectively by

$$Y_2 = a X_1 + Z_1, \tag{2}$$

$$Y_3 = X_1 + b X_2 + Z_2, \tag{3}$$

where the noise terms $Z_1$ and $Z_2$ are uncorrelated zero mean Gaussian random variables with
variances $N_1$ and $N_2$ respectively, and $a$ and $b$ are the channel gain constants. As a result, we
have

$$p(y_2, y_3 | x_1, x_2) = \frac{1}{2\pi\sqrt{N_1 N_2}} \exp\left\{ -\frac{(y_2 - a x_1)^2}{2 N_1} - \frac{(y_3 - x_1 - b x_2)^2}{2 N_2} \right\}, \tag{4}$$

which will be the channel assumed throughout the paper.
The average power constraints at the transmitters are

$$\frac{1}{n} \sum_{i=1}^{n} x_{1i}^2(k) \le P_1, \quad \forall k \in \mathcal{M}, \tag{5}$$

and

$$\frac{1}{n} \sum_{i=1}^{n} x_{2i}^2 \le P_2. \tag{6}$$



C. Known achievable rates 



We briefly review the known results for DF, CF, and SF. For the DMRC, the DF scheme
achieves any rate less than [2, Theorem 1]

$$R_{DF} = \sup \min\{ I(X_1; Y_2 | X_2),\ I(X_1, X_2; Y_3) \} \tag{7}$$

where the supremum is taken over all possible $p(x_1, x_2)$. The CF scheme achieves any rate less
than [2, Theorem 6]

$$R_{CF} = \sup I(X_1; \hat{Y}_2, Y_3 | X_2), \quad \text{such that } I(X_2; Y_3) \ge I(Y_2; \hat{Y}_2 | X_2, Y_3) \tag{8}$$

where the supremum is taken over all joint probability distributions of the form

$$p(x_1, x_2, y_2, y_3, \hat{y}_2) = p(x_1)\, p(x_2)\, p(y_2, y_3 | x_1, x_2)\, p(\hat{y}_2 | y_2, x_2). \tag{9}$$

El Gamal, Mohseni, and Zahedi [4] put forth an equivalent characterization of the CF scheme.
That is, it achieves any rate less than

$$R_{CF} = \sup \min\{ I(X_1; \hat{Y}_2, Y_3 | X_2),\ I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3) \} \tag{10}$$



October 18, 2010 DRAFT 






where the supremum is still taken over all joint probability distributions of the same form as in (9).
The supremum of rates achievable by superimposing DF and CF [2, Theorem 7] is

$$R_{SF} = \sup \Big( \min\big\{ I(X_1; Y_3, \hat{Y}_2' | X_2, U) + I(U; Y_2 | X_2, V),\ I(X_1, X_2; Y_3) - I(\hat{Y}_2'; Y_2 | U, X_1, X_2, Y_3) \big\} \Big) \tag{11}$$

where the supremum is over all joint probability distributions of the form

$$p(u, v, x_1, x_2, y_2, y_3, \hat{y}_2') = p(v)\, p(u | v)\, p(x_1 | u)\, p(x_2 | v)\, p(y_2, y_3 | x_1, x_2)\, p(\hat{y}_2' | x_2, y_2, u) \tag{12}$$

subject to the constraint

$$I(X_2; Y_3 | V) \ge I(\hat{Y}_2'; Y_2 | X_2, Y_3, U). \tag{13}$$

Finally, the rate is upper bounded by the cut-set bound

$$R_{CS} = \sup \min\{ I(X_1, X_2; Y_3),\ I(X_1; Y_2, Y_3 | X_2) \}, \tag{14}$$

where the supremum is taken over all possible distributions $p(x_1, x_2)$.

III. Broadcast over Decode and Forward 

Before investigating the coding scheme that superimposes CF and DF for the Gaussian relay
channel, we first look at a simpler coding scheme. In this scheme, partial information is
decoded first at both the relay and the destination terminals, as in a broadcast channel. The
remaining message is decoded and forwarded given the partial information available at the relay
and destination terminal. The coding scheme is equivalent to superimposing broadcast over
decode and forward.

We split the message $W$ into two parts $W'$ and $W''$ with respective rates $R'$ and $R''$. We
require that $W'$ be decoded at both relay and destination. The relay also decodes the message $W''$,
which the destination cannot decode directly, and sends this extra information to the destination in
a block Markov encoding fashion. This strategy can be designed using an auxiliary random
variable $U$ and a block Markov superposition encoding explained below.
Theorem 1: For any relay channel
$(\mathcal{X}_1 \times \mathcal{X}_2,\ p(y_2, y_3 | x_1, x_2),\ \mathcal{Y}_2 \times \mathcal{Y}_3)$,
the rate $R$ is achievable where

$$R < \sup_{p} \big\{ \min\{ I(U; Y_3),\ I(U; Y_2 | X_2) \} + \min\{ I(X_1; Y_2 | X_2, U),\ I(X_1, X_2; Y_3 | U) \} \big\} \tag{15}$$

and the supremum is taken over all probability distribution functions of the form

$$p(u, x_1, x_2, y_2, y_3) = p(u)\, p(x_2)\, p(x_1 | x_2, u)\, p(y_2, y_3 | x_1, x_2).$$
Proof of Theorem 1:

a) Codebook Generation: Encoding is performed in $K + 1$ blocks. For each block $k$,
generate $2^{nR'}$ codewords $u_k^n(s)$, $s = 1, 2, \ldots, 2^{nR'}$, by choosing the $u_{ki}(s)$ independently
using the distribution $P_U(\cdot)$. Generate $2^{nR''}$ codewords $x_{2k}^n(t)$, $t = 1, 2, \ldots, 2^{nR''}$, by
choosing the $x_{2ki}(t)$ independently using the probability distribution $P_{X_2}(\cdot)$. Now use
superposition coding and generate $2^{nR''}$ codewords $x_{1k}^n(r | s, t)$, $r = 1, 2, \ldots, 2^{nR''}$, for every
pair $(u_k^n(s), x_{2k}^n(t))$, by choosing the $x_{1ki}(r | s, t)$ independently using
$P_{X_1 | X_2, U}(\cdot | u_{ki}(s), x_{2ki}(t))$.

b) Encoding: Let $s_k$ be the message index of $W'$ and $t_k$ be the message index of $W''$
to be sent in block $k$. The source encoder then transmits $x_{1k}^n(t_k | s_k, t_{k-1})$, where $t_{k-1}$
is the index of $W''$ sent in the previous block. The relay in block $k$ sends $x_{2k}^n(\hat{t}_{k-1})$, where
$\hat{t}_{k-1}$ is the estimate of $t_{k-1}$ at the relay.

c) Decoding at relay terminal: Assume that decoding of $s_{k-1}$ and $t_{k-1}$ in block $k - 1$ has
been successful. Upon receiving $y_{2k}^n$ in block $k$, the relay looks for a unique $\hat{s}_k$ such that

$$(u_k^n(\hat{s}_k),\ x_{2k}^n(\hat{t}_{k-1}),\ y_{2k}^n) \in T_\epsilon^n(P_{U, X_2, Y_2}).$$

Having decoded $\hat{s}_k$, the relay now looks for a unique $\hat{t}_k$ such that

$$(x_{1k}^n(\hat{t}_k | \hat{s}_k, \hat{t}_{k-1}),\ u_k^n(\hat{s}_k),\ x_{2k}^n(\hat{t}_{k-1}),\ y_{2k}^n) \in T_\epsilon^n(P_{U, X_1, X_2, Y_2}).$$

d) Decoding at the sink terminal: Upon receiving $y_{3k}^n$, the destination terminal looks for
a unique $\hat{s}_k$ such that $(u_k^n(\hat{s}_k), y_{3k}^n) \in T_\epsilon^n(P_{U, Y_3})$. Now, the destination decodes
the additional information that the source sends in a block Markov decoding fashion. The destination
terminal tries to find a unique $\hat{t}_{k-1}$ such that
$(x_{2k}^n(\hat{t}_{k-1}), u_k^n(\hat{s}_k), y_{3k}^n) \in T_\epsilon^n(P_{U, X_2, Y_3})$ and

$$(x_{1(k-1)}^n(\hat{t}_{k-1} | \hat{s}_{k-1}, \hat{t}_{k-2}),\ u_{k-1}^n(\hat{s}_{k-1}),\ x_{2(k-1)}^n(\hat{t}_{k-2}),\ y_{3(k-1)}^n) \in T_\epsilon^n(P_{U, X_1, X_2, Y_3}).$$
e) Rate analysis: At the relay, since we have a single user channel from $U$ to $Y_2$, we will
be able to decode the $U$ codewords with low probability of error if $R' < I(U; Y_2 | X_2)$. We can
also decode the index $t_k$ if

$$R'' < I(X_1; Y_2 | U, X_2).$$

The destination first decodes the codeword $U$ with a low probability of error provided
$R' < I(U; Y_3)$, and then decodes the message $t_k$ using successive interference cancellation on the
messages from the relay and the source. The message would be decoded with low probability
of error provided

$$R'' < I(X_2; Y_3 | U) + I(X_1; Y_3 | X_2, U).$$

Combining all the bounds, the desired result (15) follows. ■
In this scheme, the source message is split into two parts. The message $W'$ is broadcast to both
relay and destination. The other message $W''$ is decoded by the relay first and then cooperatively
transmitted to the destination. Unfortunately, the above achievable rate does not outperform the
DF strategy, as is shown below:

$$R < \min\{ I(U; Y_2 | X_2) + I(X_1; Y_2 | X_2, U),\ I(U; Y_3) + I(X_1, X_2; Y_3 | U) \} \tag{16}$$

$$= \min\{ I(U, X_1; Y_2 | X_2),\ I(X_1, X_2; Y_3) \} \tag{17}$$

$$= \min\{ I(X_1; Y_2 | X_2),\ I(X_1, X_2; Y_3) \}, \tag{18}$$

where (18) follows from the Markov chains $U - (X_1, X_2) - Y_2$ and $U - (X_1, X_2) - Y_3$. But (18) is
the rate achieved by the Decode and Forward strategy.
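The step from (15) to (16) rests on an elementary inequality that is left implicit. The lines below are a sketch of that reasoning; the shorthand $a, b, c, d$ is introduced here purely for illustration and is not part of the original notation (it is unrelated to the channel gains $a$ and $b$).

```latex
% Illustrative shorthand (not from the paper):
%   a = I(U;Y_3),  b = I(U;Y_2|X_2),  c = I(X_1;Y_2|X_2,U),  d = I(X_1,X_2;Y_3|U)
\begin{align*}
\min\{a,b\} + \min\{c,d\} &\le b + c = I(U;Y_2|X_2) + I(X_1;Y_2|X_2,U), \\
\min\{a,b\} + \min\{c,d\} &\le a + d = I(U;Y_3) + I(X_1,X_2;Y_3|U), \\
\text{hence}\quad \min\{a,b\} + \min\{c,d\} &\le \min\{\, b + c,\ a + d \,\}.
\end{align*}
% The chain rule then gives the two terms of (17):
\begin{align*}
I(U;Y_2|X_2) + I(X_1;Y_2|X_2,U) &= I(U,X_1;Y_2|X_2), \\
I(U;Y_3) + I(X_1,X_2;Y_3|U) &= I(U,X_1,X_2;Y_3) = I(X_1,X_2;Y_3),
\end{align*}
% where the last equality holds because Y_3 depends on U only through (X_1, X_2).
```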

Although it does not provide a higher rate, the proposed scheme of broadcast over decode
and forward gives good insight into the superposition strategy. The suboptimality arises because
the messages $W'$ and $W''$, even though they are generated from the same source, act as
interference on each other. This limits the rate of decoding at the relay and destination
terminals. This interference would also be present if we superimpose DF and CF.
The rate achievable using the superposition strategy is investigated in the next section for the
case of Gaussian relay channels.
IV. Achievable Rate of the Superposition Scheme

In this section, we focus on the Gaussian relay channel. We show that when considering only
jointly Gaussian distributions for all the random variables involved in (11), superposition does
not offer a higher rate than DF or CF alone. To be more specific, we will show that when all the
random variables involved are Gaussian, then $R_{SF} \le \max(R_{DF}, R_{CF})$. Trivially, exactly one of
two cases is true:

1) Case A: $R_{DF} \ge R_{CF}$;

2) Case B: $R_{CF} > R_{DF}$.

It is then enough to show that in Case A, $R_{SF} \le R_{DF}$, and in Case B, $R_{SF} \le R_{CF}$.

A. Gaussian distribution assumption 

We assume that all random variables in (11) are zero mean and jointly Gaussian distributed.
The distribution will then depend only on the variances and the cross-correlations of the random
variables. For two generic random variables $X$ and $Y$, let

$$\phi_{X,Y} = \frac{E\{(X - E[X])(Y - E[Y])\}}{\sqrt{E[X^2]\, E[Y^2]}}$$

denote the correlation coefficient between them. The following lemma is useful in deducing
correlations from known ones.

Lemma 1: Let $X - Y - Z$ be a Markov chain of jointly Gaussian random variables. Then

$$\phi_{X,Z} = \phi_{X,Y}\, \phi_{Y,Z}.$$

Proof: See appendix. ■

Returning to the random variables involved in $R_{SF}$, we denote $\alpha = \phi_{U,V}$, $\beta = \phi_{V,X_2}$, and
$\gamma = \phi_{U,X_1}$. Using Lemma 1, we obtain from the Markov chain $U - V - X_2$ that

$$\delta := \phi_{U,X_2} = \phi_{V,U} \cdot \phi_{V,X_2} = \alpha\beta, \tag{19}$$

and from the Markov chain $X_1 - U - X_2$ that

$$\rho := \phi_{X_1,X_2} = \phi_{X_1,U} \cdot \phi_{U,X_2} = \gamma\delta = \alpha\beta\gamma. \tag{20}$$

Fig. 2 shows the correlation between the random variables along with their dependencies on
each other.
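Lemma 1 can be checked on a concrete chain. The following is a minimal sketch, assuming a zero-mean Gaussian chain built as $Y = cX + W_1$ and $Z = dY + W_2$ with independent $X$, $W_1$, $W_2$; the function names and the numerical values are illustrative and not taken from the paper. Covariances are computed in closed form, so no sampling is involved.

```python
import math


def corr(cov_xy, var_x, var_y):
    """Correlation coefficient of two zero-mean random variables."""
    return cov_xy / math.sqrt(var_x * var_y)


def markov_chain_correlations(var_x, c, var_w1, d, var_w2):
    """Correlations for the zero-mean Gaussian chain X - Y - Z defined by
    Y = c*X + W1 and Z = d*Y + W2, with X, W1, W2 mutually independent."""
    var_y = c * c * var_x + var_w1
    var_z = d * d * var_y + var_w2
    cov_xy = c * var_x          # E[XY]
    cov_yz = d * var_y          # E[YZ]
    cov_xz = c * d * var_x      # E[XZ]
    return (corr(cov_xy, var_x, var_y),
            corr(cov_yz, var_y, var_z),
            corr(cov_xz, var_x, var_z))


# Hypothetical parameter values, chosen only to exercise the identity.
phi_xy, phi_yz, phi_xz = markov_chain_correlations(2.0, 0.7, 1.5, -1.3, 0.4)
print(phi_xz, phi_xy * phi_yz)  # the two numbers should agree up to rounding
```

The same product rule is what yields $\delta = \alpha\beta$ and $\rho = \alpha\beta\gamma$ when applied along the chain $X_1 - U - V - X_2$.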
B. Main Result

The main result is stated in the following theorem. Two lemmas that are needed in the proof
are stated and proved in the appendix.

Theorem 2: Let $(X_1, X_2, Y_2, Y_3, \hat{Y}_2, \hat{Y}_2', U, V)$ be a set of jointly Gaussian random variables
whose joint distribution can be factorized in the following form:

$$p(u, v, x_1, x_2, y_2, y_3, \hat{y}_2', \hat{y}_2) = p(v)\, p(u | v)\, p(x_2 | v)\, p(x_1 | u)\, p(y_2, y_3 | x_1, x_2)\, p(\hat{y}_2' | y_2, u, x_2)\, p(\hat{y}_2 | y_2, x_2), \tag{21}$$

where $p(y_2, y_3 | x_1, x_2)$ is as given in (4). Let $\mathcal{P}$ denote the class of distributions specified
by (21). Let $\mathcal{P}'$ denote the subset of $\mathcal{P}$ with distributions that also satisfy the constraint (13).
We have

$$\sup_{\mathcal{P}'} \min\{ I(X_1; Y_3, \hat{Y}_2' | X_2, U) + I(U; Y_2 | X_2, V),\ I(X_1, X_2; Y_3) - I(\hat{Y}_2'; Y_2 | X_1, X_2, U, Y_3) \} \tag{22}$$

$$= \max\Big\{ \sup_{\mathcal{P}} \min\{ I(X_1; Y_2 | X_2),\ I(X_1, X_2; Y_3) \}, \tag{23}$$

$$\sup_{\mathcal{P}} \min\{ I(X_1; \hat{Y}_2, Y_3 | X_2),\ I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3) \} \Big\}. \tag{24}$$

Proof: The rates appearing in (22)-(24) are $R_{SF}$, $R_{DF}$, and $R_{CF}$, respectively. Since, through
a judicious choice of the random variables $U$ and $V$, DF and CF can be cast as special cases
of SF [2], we have $R_{SF} \ge R_{DF}$ and $R_{SF} \ge R_{CF}$. It is then sufficient to show that
$R_{SF} \le \max(R_{DF}, R_{CF})$.

Under the Gaussian assumption, the compressed version $\hat{Y}_2'$ of $Y_2$ in (12) can be written as

$$\hat{Y}_2' = c_1 Y_2 + c_2 U + c_3 X_2 + Z_w' \tag{25}$$

where $c_1, c_2, c_3$ are constant parameters, and $Z_w'$ is Gaussian and independent of $Y_2$, $U$, and $X_2$.
Since in both (11) and (13), the three mutual information terms involving $\hat{Y}_2'$, namely,

$$I(X_1; Y_3, \hat{Y}_2' | X_2, U), \quad I(\hat{Y}_2'; Y_2 | X_2, X_1, U, Y_3), \quad I(\hat{Y}_2'; Y_2 | X_2, U, Y_3)$$

are all conditioned on $U$ and $X_2$, the coefficients $c_2$ and $c_3$ do not affect the values of these
terms. Therefore we can set $c_2 = c_3 = 0$. It is also true that scaling $\hat{Y}_2'$ by a constant does not
change any of the terms. So unless $c_1 = 0$, we can assume $c_1 = 1$, as we do in the following.
The case $c_1 = 0$ is known as the partial decode and forward scheme, which is known
to be inferior to the full DF scheme [4]. We denote the variance of $Z_w'$ as $A'$. The amount of
compression, which is controlled by the parameter $A'$, depends on the constraint (13) imposed
by the relay link channel and the encoding scheme at the relay. In summary, we can take without
loss of generality

$$\hat{Y}_2' = Y_2 + Z_w'. \tag{26}$$

The following is a broad outline of the proof. Given any rate achieved by the SF scheme, we
can find a CF scheme or a DF scheme which achieves a rate higher than or equal to that of SF. The
$\hat{Y}_2$ for the CF scheme is set to be statistically equal to $\hat{Y}_2'$ of the SF scheme in (26):

$$\hat{Y}_2 = Y_2 + Z_w, \tag{27}$$

where $Z_w$ is zero mean Gaussian with variance $A = A'$. Such a $\hat{Y}_2$ qualifies as the compressed
version of $Y_2$ in CF. This choice of $\hat{Y}_2$ is enough to achieve a rate at least as high as SF, even
though it may be suboptimal among the rates achievable by CF.
First, we have

$$I(\hat{Y}_2'; Y_2 | X_1, X_2, U, Y_3) = h(Y_2 | X_1, X_2, U, Y_3) - h(Y_2 | X_1, X_2, U, Y_3, \hat{Y}_2') \tag{28}$$

$$= h(Y_2 | X_1, X_2, Y_3) - h(Y_2 | X_1, X_2, U, Y_3, \hat{Y}_2') \tag{29}$$

$$\ge h(Y_2 | X_1, X_2, Y_3) - h(Y_2 | X_1, X_2, Y_3, \hat{Y}_2') \tag{30}$$

$$\ge h(Y_2 | X_1, X_2, Y_3) - h(Y_2 | X_1, X_2, Y_3, \hat{Y}_2) \tag{31}$$

$$= I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3) \tag{32}$$

where (29) is due to the Markov chain $U - (X_1, X_2, Y_3) - Y_2$; (30) uses the fact that conditioning
does not increase entropy; and (31) holds because, given $(X_2, U)$, $\hat{Y}_2'$ is statistically equivalent
to $\hat{Y}_2$. Thus, we have shown

$$I(X_1, X_2; Y_3) - I(\hat{Y}_2'; Y_2 | X_1, X_2, U, Y_3) \le I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3). \tag{33}$$

It then remains to be shown that

$$I(X_1; Y_3, \hat{Y}_2' | X_2, U) + I(U; Y_2 | X_2, V) \le \max\{ I(X_1; Y_2 | X_2),\ I(X_1; \hat{Y}_2, Y_3 | X_2) \}. \tag{34}$$
Depending on which one of the two terms on the right hand side is bigger, we have two cases.

In the first case,

$$I(X_1; Y_2 | X_2) \ge I(X_1; Y_3, \hat{Y}_2 | X_2) \tag{35}$$

and we have

$$I(U; Y_2 | V, X_2) + I(X_1; Y_3, \hat{Y}_2' | X_2, U) \tag{36}$$

$$= I(U; Y_2 | V, X_2) + I(X_1; Y_3, \hat{Y}_2 | X_2, U) \tag{37}$$

$$= I(U; Y_2 | X_2) - I(V; Y_2 | X_2) + I(X_1; Y_3, \hat{Y}_2 | X_2, U) \tag{38}$$

$$= I(X_1; Y_2 | X_2) - I(X_1; Y_2 | X_2, U) - I(V; Y_2 | X_2) + I(X_1; Y_3, \hat{Y}_2 | X_2, U) \tag{39}$$

$$\le I(X_1; Y_2 | X_2) - I(X_1; \hat{Y}_2, Y_3 | X_2, U) - I(V; Y_2 | X_2) + I(X_1; Y_3, \hat{Y}_2 | X_2, U) \tag{40}$$

$$= I(X_1; Y_2 | X_2) - I(V; Y_2 | X_2) \tag{41}$$

$$\le I(X_1; Y_2 | X_2) \tag{42}$$

where (37) follows by our choice of $\hat{Y}_2$ to be statistically the same as $\hat{Y}_2'$; (38) follows from the
Markov chain $V - (U, X_2) - Y_2$; (39) follows from the Markov chain $U - (X_1, X_2) - Y_2$; (40)
follows from (35) and Lemma 2, which is stated and proved in Appendix B; and (42) follows
from the fact that mutual information is nonnegative.
In the second case,

$$I(X_1; Y_2 | X_2) < I(X_1; Y_3, \hat{Y}_2 | X_2) \tag{43}$$

and we have

$$I(X_1; Y_3, \hat{Y}_2' | X_2, U) + I(U; Y_2 | V, X_2)$$

$$= I(X_1; Y_3, \hat{Y}_2 | X_2, U) + I(U; Y_2 | V, X_2) \tag{44}$$

$$= I(X_1; Y_3, \hat{Y}_2 | X_2, U) + I(U; Y_2 | X_2) - I(V; Y_2 | X_2) \tag{45}$$

$$\le I(X_1; Y_3, \hat{Y}_2 | X_2, U) + I(U; Y_3, \hat{Y}_2 | X_2) - I(V; Y_2 | X_2) \tag{46}$$

$$= I(X_1; Y_3, \hat{Y}_2 | X_2) - I(V; Y_2 | X_2) \tag{47}$$

$$\le I(X_1; Y_3, \hat{Y}_2 | X_2) \tag{48}$$

where (44) follows by our choice of $\hat{Y}_2$ to be statistically the same as $\hat{Y}_2'$; (45) follows from
the Markov chain $V - (U, X_2) - Y_2$; (46) follows from (43) and Lemma 3, which is stated and
proved in Appendix B; (47) follows from the Markov chain $U - (X_1, X_2) - (\hat{Y}_2, Y_3)$; and (48)
follows from the fact that mutual information is nonnegative.

Thus we have shown that (34) holds, and the proof is complete. ■
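Inequality (34) can also be spot-checked numerically. The sketch below is not part of the original proof. It assumes the closed-form Gaussian expressions obtained by conditioning ($X_1$ has residual variance $P_1(1-\gamma^2)$ given $U$, $P_1(1-\alpha^2\gamma^2)$ given $V$, and $P_1(1-\rho^2)$ given $X_2$), rates measured in bits, and hypothetical sampling ranges for the channel parameters. All function names are illustrative.

```python
import math
import random


def half_log2(x):
    """Return (1/2) log2(x), the basic Gaussian mutual-information term."""
    return 0.5 * math.log2(x)


def lhs_34(a, P1, N1, N2, alpha, gamma, A):
    """Left-hand side of (34) under the Gaussian assumption."""
    s = a * a / N1                        # SNR gain of Y2 per unit power of X1
    t = a * a / (N1 + A) + 1.0 / N2       # SNR gain of (Yhat2, Y3) per unit power
    g = 1.0 - gamma ** 2                  # residual fraction of P1 given U
    q = 1.0 - (alpha * gamma) ** 2        # residual fraction of P1 given V
    compress_term = half_log2(1.0 + P1 * g * t)                       # I(X1; Y3, Yhat2' | X2, U)
    decode_term = half_log2((1.0 + s * P1 * q) / (1.0 + s * P1 * g))  # I(U; Y2 | X2, V)
    return compress_term + decode_term


def rhs_34(a, P1, N1, N2, rho, A):
    """Right-hand side of (34): the larger of the DF-type and CF-type terms."""
    s = a * a / N1
    t = a * a / (N1 + A) + 1.0 / N2
    r = 1.0 - rho ** 2                    # residual fraction of P1 given X2
    return max(half_log2(1.0 + P1 * r * s), half_log2(1.0 + P1 * r * t))


def count_violations(trials=10000, seed=0, tol=1e-9):
    """Count random parameter draws for which (34) fails beyond tolerance."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        a = rng.uniform(0.1, 5.0)
        P1 = rng.uniform(0.1, 10.0)
        N1 = rng.uniform(0.1, 5.0)
        N2 = rng.uniform(0.1, 5.0)
        A = rng.uniform(0.01, 10.0)
        alpha, beta, gamma = rng.random(), rng.random(), rng.random()
        rho = alpha * beta * gamma        # relation (20)
        if lhs_34(a, P1, N1, N2, alpha, gamma, A) > rhs_34(a, P1, N1, N2, rho, A) + tol:
            violations += 1
    return violations


print(count_violations())
```

A nonzero count would indicate a counterexample to (34) under these assumed expressions; the theorem predicts that none exists.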

C. Discussion 

We have shown that SF does not outperform the better of DF and CF. We provide some intuitive
explanation in the following.

Observe from (27) that $\hat{Y}_2$ is the quantized signal of $Y_2$ in the CF scheme. The variance of
$Z_w$ is $A$, which in general could be different from $A'$, the variance of $Z_w'$ in (25). From the
constraint (8), we have $A \ge A_{CF}$, where

$$A_{CF} = \frac{N_1 N_2 + (N_1 + a^2 N_2) P_1}{b^2 P_2}. \tag{49}$$

Although the constraint is not explicitly imposed in the formulation in (10), it can be shown
that setting $A = A_{CF}$ actually maximizes the minimum of the two terms on the right hand side of
(10), and equalizes them:

$$I(X_1; \hat{Y}_2, Y_3 | X_2) = I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3). \tag{50}$$

It can be verified that

1) $I(X_1; \hat{Y}_2, Y_3 | X_2)$ is a monotonically decreasing function of $A$ (coarser compression
reduces the useful information about $X_1$ in $\hat{Y}_2$);

2) $I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 | X_1, X_2, Y_3)$ is a monotonically increasing function of $A$.

Therefore the minimum of the two functions is maximized at their crossing point, which happens
at $A = A_{CF}$. In other words, for CF, within the relay-destination link rate limit $I(X_2; Y_3)$,
finer compression (smaller $A$) yields a higher rate overall. For SF, however, the situation is
different. The parameter $A'$, which controls the amount of compression in (25), needs to be chosen
to satisfy the constraint (13). In particular, we have $A' \ge A_{SF}$, where

$$A_{SF} = \frac{\big(N_2 + P_1(1 - \alpha^2\gamma^2)\big)\big(N_1 N_2 + (N_1 + a^2 N_2) P_1 (1 - \gamma^2)\big)}{b^2 P_2 (1 - \beta^2)\,[N_2 + P_1(1 - \gamma^2)]}. \tag{51}$$
In general $A_{SF}$ can be less than $A_{CF}$; e.g., when $\gamma > 0$, $\alpha = 1$ and $\beta = 0$. In contrast to
the CF case, it is not true for SF that finer compression (smaller $A'$) necessarily yields a higher
rate. The intuitive reason is that the relay has two messages to transmit to the destination: the
partially decoded message carried by $U$ and the compressed version of $Y_2$ carried by $\hat{Y}_2'$. Although
reducing $A'$ will provide the destination with a more faithful representation of $Y_2$, and enlarge the
term $I(X_1; Y_3, \hat{Y}_2' | X_2, U) + I(U; Y_2 | X_2, V)$, it will reduce the relay's ability to cooperate with
the source through the message $U$, and hence enlarge the gap $I(\hat{Y}_2'; Y_2 | X_1, X_2, U, Y_3)$ from the
multiple-access cut-set bound $I(X_1, X_2; Y_3)$, which then becomes the rate limiting factor. The
optimum amount of compression turns out to be the same as in the CF case, and superposition
of DF and CF does not help the rate, which agrees with the observation that we made in
Section III.

Finally, we remark that in our proof we did not use the constraint (13). So it is true that for
the Gaussian distribution, even without the constraint, SF does not result in a rate that is
higher than the larger of $R_{DF}$ and $R_{CF}$.
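The two compression thresholds discussed above can be evaluated directly. The following sketch assumes the Gaussian closed forms $A_{CF} = [N_1 N_2 + (N_1 + a^2 N_2)P_1]/(b^2 P_2)$ and the expression for $A_{SF}$ obtained by evaluating constraint (13) with conditional variances; the function names, the choice of bits as the unit, and the sample parameter values are assumptions made for illustration.

```python
import math


def a_cf(a, b, P1, P2, N1, N2):
    """Smallest compression-noise variance allowed by the CF constraint (8)."""
    return (N1 * N2 + (N1 + a * a * N2) * P1) / (b * b * P2)


def a_sf(a, b, P1, P2, N1, N2, alpha, beta, gamma):
    """Smallest compression-noise variance allowed by the SF constraint (13)."""
    g = 1.0 - gamma ** 2
    num = (N2 + P1 * (1.0 - (alpha * gamma) ** 2)) * (N1 * N2 + (N1 + a * a * N2) * P1 * g)
    den = b * b * P2 * (1.0 - beta ** 2) * (N2 + P1 * g)
    return num / den


def cf_terms(a, b, P1, P2, N1, N2, A):
    """The two terms inside the min of (10), for independent Gaussian X1, X2."""
    first = 0.5 * math.log2(1.0 + P1 * (a * a / (N1 + A) + 1.0 / N2))
    second = (0.5 * math.log2(1.0 + (P1 + b * b * P2) / N2)
              - 0.5 * math.log2(1.0 + N1 / A))
    return first, second


# Hypothetical channel: a = b = 2, P1 = P2 = 5, N1 = N2 = 1.
params = (2.0, 2.0, 5.0, 5.0, 1.0, 1.0)
A_star = a_cf(*params)
first, second = cf_terms(*params, A_star)
print(A_star, first, second)                      # the two terms coincide at A = A_CF
print(a_sf(*params, alpha=1.0, beta=0.0, gamma=0.5) < A_star)
```

In this sketch the first term decreases and the second increases as `A` grows, so their minimum peaks where they cross, which is the behaviour described for (50). The last line illustrates the remark that $A_{SF}$ can fall below $A_{CF}$ when $\gamma > 0$, $\alpha = 1$, $\beta = 0$.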

V. Numerical Result 

Consider an example Gaussian relay channel in which the source and the destination are
separated by a unit distance, and the relay is at distance $d$ from the source and $1 - d$ from the
destination. The channel gain between any two nodes is inversely proportional to their distance,
so $a = 1/d$ and $b = 1/(1 - d)$. The additive noises at the relay and the destination are independent
but have the same variance $N_1 = N_2 = 1$. The transmit powers are set to $P_1 = P_2 = 5$.

Fig. 3 shows the numerical rates achievable by DF and CF, and the cut-set bound (14), as a function
of the distance $d$ of the relay from the source terminal. Depending on $d$, there are three cases:

1) When $d$ is small (roughly $d < 0.2$), DF is optimal. The rate achieved by DF is equal
to $I(X_1, X_2; Y_3)$, the multiple-access cut-set bound. The reason is that the source message
can be fully decoded at the relay.

2) For medium $d$ (roughly $0.2 < d < 0.6$), DF is not optimal, but still performs better than
CF. In this case, the rate of DF is dominated by $I(X_1; Y_2 | X_2)$, the amount of information
that can be decoded at the relay, which dictates the amount of cooperation possible between
source and relay. In this region, the relay-sink channel is "poor", so that sending a "finely"
compressed version of $Y_2$ is not possible.

3) For large $d$ (roughly $0.6 < d < 1$), CF outperforms DF. In this region, the ability of the
relay to decode the source is weak, and it is more fruitful to send a compressed version of
the relay's observation. Only in the extreme case, $d = 1$, does CF actually achieve the
cut-set bound.
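The DF, CF and cut-set curves for this example can be reproduced approximately with a few lines of code. The sketch below is not the authors' program. It assumes the standard Gaussian evaluations of (7), (10) and (14), a grid search over the source-relay correlation $\rho \in [0, 1]$, CF with independent inputs and $A = A_{CF}$, and rates in bits per channel use. The function name `rates` and the grid size are illustrative.

```python
import math


def half_log2(x):
    """Return (1/2) log2(x)."""
    return 0.5 * math.log2(x)


def rates(d, P1=5.0, P2=5.0, N1=1.0, N2=1.0, grid=2001):
    """Return (R_DF, R_CF, R_CS) for relay position d in (0, 1)."""
    a, b = 1.0 / d, 1.0 / (1.0 - d)
    df = cs = 0.0
    for i in range(grid):
        rho = i / (grid - 1)              # correlation between X1 and X2
        r = 1.0 - rho * rho
        # Multiple-access cut: I(X1, X2; Y3)
        mac = half_log2(1.0 + (P1 + b * b * P2 + 2.0 * rho * b * math.sqrt(P1 * P2)) / N2)
        # DF: the relay must decode, so the first term is I(X1; Y2 | X2)
        df = max(df, min(half_log2(1.0 + a * a * P1 * r / N1), mac))
        # Cut-set: broadcast cut I(X1; Y2, Y3 | X2)
        cs = max(cs, min(half_log2(1.0 + P1 * r * (a * a / N1 + 1.0 / N2)), mac))
    # CF with the compression noise set to the threshold A_CF
    A = (N1 * N2 + (N1 + a * a * N2) * P1) / (b * b * P2)
    cf = half_log2(1.0 + P1 * (a * a / (N1 + A) + 1.0 / N2))
    return df, cf, cs


for d in (0.1, 0.3, 0.5, 0.7, 0.9):
    df, cf, cs = rates(d)
    print(f"d={d:.1f}  DF={df:.3f}  CF={cf:.3f}  cut-set={cs:.3f}")
```

Under these assumptions DF exceeds CF at $d = 0.5$ and CF exceeds DF at $d = 0.8$, consistent with the second and third cases above, and both stay below the cut-set bound.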

The rate achievable by superimposing DF and CF given by (11) is numerically compared with
the rates achieved by CF, DF and the cut-set bound. The mutual information terms of (11) are
evaluated for the choice of appropriate Gaussian random variables, according to (59) and

(52)

(53)

(54)

The constraint $I(\hat{Y}_2'; Y_2 | U, X_2, Y_3) \le I(X_2; Y_3 | V)$ evaluates to $A' \ge A_{SF}$, where $A_{SF}$ is as
given in (51). The correlation terms $\alpha$, $\beta$, $\gamma$ and the variance $A'$ are optimizing parameters, which
control the amount of information that is decoded and the amount that is compressed. When
all the parameters have been optimized within the constraint posed by (51), SF is found to
achieve the maximum of $R_{DF}$ and $R_{CF}$, as shown in Fig. 4.

VI. Conclusion

We analyzed the coding strategy of superimposing CF and DF for the Gaussian relay channel.
We note that superposition of CF and DF does not provide higher achievable rates than the
individual DF and CF schemes for the Gaussian case. We conclude that we should look for new
strategies different from the superposition strategy, or look for non-Gaussian distributions for the
superposition scheme, or try to find tighter upper bounds than the cut-set bound.
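Appendix

A. Proof of Lemma 1

Proof: Assume without loss of generality that all three random variables are zero mean.
We have

$$\phi_{X,Z} = \frac{E[XZ]}{\sqrt{E[X^2]\, E[Z^2]}} = \frac{E\{E[XZ | Y]\}}{\sqrt{E[X^2]\, E[Z^2]}} = \frac{E\{E[X | Y]\, E[Z | Y]\}}{\sqrt{E[X^2]\, E[Z^2]}}$$

$$= \frac{E\Big\{ \phi_{X,Y} \sqrt{E[X^2]/E[Y^2]}\; Y \cdot \phi_{Y,Z} \sqrt{E[Z^2]/E[Y^2]}\; Y \Big\}}{\sqrt{E[X^2]\, E[Z^2]}} = \phi_{X,Y}\, \phi_{Y,Z}.$$

■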

B. Two lemmas needed in the proof of Theorem 2 

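We prove two lemmas in the following that will be useful in the proof of Theorem 2. Lemma 2
is used in the case $R_{DF} \ge R_{CF}$. Lemma 3 is used in the case $R_{DF} < R_{CF}$.

Lemma 2: Let $(X_1, X_2, Y_2, Y_3, \hat{Y}_2, U, V)$ be jointly Gaussian random variables with joint
distribution $p(u, v, x_1, x_2, y_2, y_3, \hat{y}_2) = p(v)\, p(u | v)\, p(x_2 | v)\, p(x_1 | u)\, p(y_2, y_3 | x_1, x_2)\, p(\hat{y}_2 | y_2, x_2)$,
where $p(y_2, y_3 | x_1, x_2)$ is as given in (4). If $I(X_1; Y_2 | X_2) \ge I(X_1; Y_3, \hat{Y}_2 | X_2)$ then
$I(X_1; Y_2 | X_2, U) \ge I(X_1; \hat{Y}_2, Y_3 | X_2, U)$.

Proof: Under the Gaussian assumption, we have

$$I(X_1; Y_2 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{a^2 P_1 (1 - \rho^2)}{N_1} \right\} \tag{56}$$

$$I(X_1; Y_2 | X_2, U) = \frac{1}{2} \log\left\{ 1 + \frac{a^2 P_1 (1 - \gamma^2)}{N_1} \right\} \tag{57}$$

$$I(X_1; \hat{Y}_2, Y_3 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{P_1 (1 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2} \right\} \tag{58}$$

$$I(X_1; \hat{Y}_2, Y_3 | X_2, U) = \frac{1}{2} \log\left\{ 1 + \frac{P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2} \right\} \tag{59}$$

Obviously when $\rho = 1$ and hence $\gamma = 1$ (because $\rho = \alpha\beta\gamma$), the lemma holds. We thus
assume that $\rho < 1$. Since $I(X_1; Y_2 | X_2) \ge I(X_1; \hat{Y}_2, Y_3 | X_2)$, from (56) and (58) we have

$$\frac{a^2 P_1 (1 - \rho^2)}{N_1} \ge \frac{P_1 (1 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2}. \tag{60}$$

Multiplying both sides by $(1 - \gamma^2)/(1 - \rho^2)$, we obtain

$$\frac{a^2 P_1 (1 - \gamma^2)}{N_1} \ge \frac{P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2}. \tag{61}$$

It then follows that $I(X_1; Y_2 | X_2, U) \ge I(X_1; \hat{Y}_2, Y_3 | X_2, U)$ from the monotonic property of the
logarithmic function. ■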

Lemma 3: Let $(X_1, X_2, Y_2, Y_3, \hat{Y}_2, U, V)$ be jointly Gaussian random variables with
distribution $p(u, v, x_1, x_2, y_2, y_3, \hat{y}_2) = p(v)\, p(u | v)\, p(x_2 | v)\, p(x_1 | u)\, p(y_2, y_3 | x_1, x_2)\, p(\hat{y}_2 | y_2, x_2)$,
where $p(y_2, y_3 | x_1, x_2)$ is as given in (4). If $I(X_1; Y_2 | X_2) < I(X_1; \hat{Y}_2, Y_3 | X_2)$ then
$I(U; Y_2 | X_2) \le I(U; \hat{Y}_2, Y_3 | X_2)$.

Proof: Under the Gaussian assumption, we have

$$I(X_1; Y_2 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{a^2 P_1 (1 - \rho^2)}{N_1} \right\} \tag{62}$$

$$I(X_1; \hat{Y}_2, Y_3 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{P_1 (1 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2} \right\} \tag{63}$$

$$I(U; Y_2 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{a^2 P_1 (\gamma^2 - \rho^2)}{N_1 + a^2 P_1 (1 - \gamma^2)} \right\} \tag{64}$$

$$I(U; \hat{Y}_2, Y_3 | X_2) = \frac{1}{2} \log\left\{ 1 + \frac{P_1 (\gamma^2 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2 + P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]} \right\} \tag{65}$$

It can be verified that when $\gamma = 1$, $I(X_1; Y_2 | X_2) = I(U; Y_2 | X_2)$ and
$I(X_1; \hat{Y}_2, Y_3 | X_2) = I(U; \hat{Y}_2, Y_3 | X_2)$, so that the desired result holds in this case. In the
following, we assume that $\gamma < 1$, and therefore $\rho = \alpha\beta\gamma < 1$.

Since $I(X_1; Y_2 | X_2) < I(X_1; \hat{Y}_2, Y_3 | X_2)$, it follows from (62) and (63) that

$$\frac{a^2 P_1 (1 - \rho^2)}{N_1} < \frac{P_1 (1 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2}. \tag{66}$$

Multiplying both sides of (66) by $(1 - \gamma^2)/(1 - \rho^2)$, we obtain

$$\frac{a^2 P_1 (1 - \gamma^2)}{N_1} < \frac{P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2}. \tag{67}$$

Adding the numerator to the denominator on both sides, we obtain

$$\frac{a^2 P_1 (1 - \gamma^2)}{N_1 + a^2 P_1 (1 - \gamma^2)} < \frac{P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2 + P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}. \tag{68}$$

Multiplying both sides of (68) by $(\gamma^2 - \rho^2)/(1 - \gamma^2)$, which is nonnegative, we obtain

$$\frac{a^2 P_1 (\gamma^2 - \rho^2)}{N_1 + a^2 P_1 (1 - \gamma^2)} \le \frac{P_1 (\gamma^2 - \rho^2)[(N_1 + A) + a^2 N_2]}{(N_1 + A) N_2 + P_1 (1 - \gamma^2)[(N_1 + A) + a^2 N_2]}. \tag{69}$$

It then follows that $I(U; Y_2 | X_2) \le I(U; \hat{Y}_2, Y_3 | X_2)$ due to the monotonic property of the
logarithmic function. ■
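The implication in Lemma 3 can be probed numerically as well. The sketch below assumes the Gaussian closed forms $I(U; Y_2 | X_2) = \tfrac{1}{2}\log\{1 + s P_1(\gamma^2 - \rho^2)/(1 + s P_1(1 - \gamma^2))\}$ with $s = a^2/N_1$, and the same form with $s$ replaced by $t = a^2/(N_1 + A) + 1/N_2$ for $I(U; \hat{Y}_2, Y_3 | X_2)$. The symbols `s`, `t`, the function names, and the sampling ranges are illustrative choices, not part of the paper.

```python
import math
import random


def info_about_u(snr_gain, P1, gamma, rho):
    """(1/2) log2 of the ratio of conditional variances given X2 versus (X2, U),
    for an observation whose per-unit-power SNR gain is `snr_gain`."""
    num = snr_gain * P1 * (gamma ** 2 - rho ** 2)
    den = 1.0 + snr_gain * P1 * (1.0 - gamma ** 2)
    return 0.5 * math.log2(1.0 + num / den)


def count_lemma3_violations(trials=10000, seed=0, tol=1e-12):
    """Count draws where the hypothesis of Lemma 3 holds but its conclusion fails."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        a = rng.uniform(0.1, 5.0)
        P1 = rng.uniform(0.1, 10.0)
        N1 = rng.uniform(0.1, 5.0)
        N2 = rng.uniform(0.1, 5.0)
        A = rng.uniform(0.01, 10.0)
        gamma = rng.uniform(0.0, 0.999)
        rho = gamma * rng.random()        # rho = alpha * beta * gamma, so |rho| <= |gamma|
        s = a * a / N1                    # gain for Y2 alone
        t = a * a / (N1 + A) + 1.0 / N2   # gain for (Yhat2, Y3)
        hypothesis = s < t                # I(X1; Y2 | X2) < I(X1; Yhat2, Y3 | X2)
        if hypothesis and info_about_u(s, P1, gamma, rho) > info_about_u(t, P1, gamma, rho) + tol:
            violations += 1
    return violations


print(count_lemma3_violations())
```

The check works because the map $x \mapsto x/(1 + x P_1(1 - \gamma^2))$ is increasing in $x$, which mirrors the "add the numerator to the denominator" step in the proof.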
Fig. 1. Gaussian relay channel.

Fig. 2. Dependency graph of random variables with correlation coefficients.

Fig. 3. Achievable rates for the Gaussian relay channel, where d is the normalized distance from source to relay.

Fig. 4. Achievable rates for the Gaussian relay channel. The parameters of the superimposing strategy are optimized to maximize
the achievable rate.